Word Sense Disambiguation in Clinical Text
Abstract
Lexical ambiguity, which arises when a string has multiple meanings, is pervasive in language across all domains. Word sense disambiguation (WSD) and word sense induction (WSI) are the tasks of resolving this ambiguity. Applications in the clinical and biomedical domain focus on the potential that disambiguation has for information extraction. Most approaches to the problem are unsupervised or semi-supervised because of the high cost of obtaining enough annotated data for supervised learning. In this thesis we apply a semi-supervised, state-of-the-art general-domain WSI method to clinical text and compare it with the best known knowledge-based unsupervised methods in the clinical domain. We also explore improvements to the general-domain method, which is based on topic modeling, by adding features that incorporate syntax and information from knowledge bases, and we investigate ways to mitigate the need for annotated data.
Thesis Supervisor: Peter Szolovits
Title: Professor of Computer Science and Engineering, MIT
Similar resources
A Study of the Role of Homograph Context Types in Determining Similarity Between Documents
Aim: Automatic information retrieval is based on the assumption that texts contain content or structural elements that can be used in word sense disambiguation, thereby improving the effectiveness of the results retrieved. Homographs are among the words requiring sense disambiguation. Depending on their roles and positions in texts, homograph contexts can be divided into different types, wit...
How much does word sense disambiguation help in sentiment analysis of micropost data?
This short paper describes a sentiment analysis system for micro-post data that includes analysis of tweets from Twitter and Short Messaging Service (SMS) text messages. We discuss our system that makes use of Word Sense Disambiguation techniques in sentiment analysis at the message level, where the entire tweet or SMS text was analysed to determine its dominant sentiment. Previous work done in...
PageRank on Semantic Networks, with Application to Word Sense Disambiguation
This paper presents a new open text word sense disambiguation method that combines the use of logical inferences with PageRank-style algorithms applied on graphs extracted from natural language documents. We evaluate the accuracy of the proposed algorithm on several sense-annotated texts, and show that it consistently outperforms the accuracy of other previously proposed knowledge-based word sen...
Word Sense Disambiguation Using Semantic Graph
This work describes a method of word sense disambiguation by finding similar words in a text. We have used some characteristic properties of the text and its constituent words for the disambiguation task. Using WordNet, the algorithm constructs a semantic structure on the text illustrating the relations among the words of the text. This structure is then used for disambiguating the constitu...
SenseLearner: Word Sense Disambiguation for All Words in Unrestricted Text
This paper describes SENSELEARNER – a minimally supervised word sense disambiguation system that attempts to disambiguate all content words in a text using WordNet senses. We evaluate the accuracy of SENSELEARNER on several standard sense-annotated data sets, and show that it compares favorably with the best results reported during the recent SENSEVAL evaluations.